Challenges in Creating a Taxonomy for Genres of Digital Documents

Abstract

Introduction: We report on one phase of a project whose aim is to discover whether and how identifying the genres of digital documents helps in a variety of information-seeking tasks (Crowston & Kwaśnik, 2005-07). The project has three phases:

I. Harvesting and identifying a test set of webpages from journalists, teachers, and engineers, three groups, each of which shares a discourse community in which a set of identifiable tasks and genres may play a role and in which identifying the genre of a document is likely to be important for the group's tasks. For each webpage we ask the respondent to identify the task, the type of webpage (genre), the clues the respondent used to identify the genre, and the usefulness of that document to the task.

II. In the second phase, presently underway, we attempt to build a faceted taxonomy of the genres identified in Phase I. This is the phase on which we focus in this paper.

III. In the final phase we will test the utility of including genre information. Using the taxonomy created in Phase II, we will manipulate a simulated search environment to test the effect of genre identification on such tasks as query formulation, searching, and processing of search output.


Similar resources

Analyzing registry, log files, and prefetch files in finding digital evidence in graphic design applications

The products of graphic design applications leave behind traces of digital information which can be used during a digital forensic investigation in cases where counterfeit documents have been created. This paper analyzes the digital forensics involved in the creation of counterfeit documents. This is achieved by first recognizing the digital forensic artifacts left behind from the use of graphi...


Genre-based Metadata for Enterprise Document Management

Contemporary challenges for enterprise document management (EDM) include managing a mixture of technologies, recognizing the needs of several user roles and groups, and pursuing effective processes utilizing documents in digital form. Responding to these challenges means gathering and scrutinizing organizational metadata describing the organization's information resources. Despite the volume of...


Automatic Music Genre Identification

Nowadays, automatic analysis of music signals has gained a considerable importance due to the growing amount of music data found on the Web. Music genre classification is one of the interesting research areas in music information retrieval systems. In this paper several techniques were implemented and evaluated for music genre classification including feature extraction, feature selection and m...


Thesis Stereotyping the Web: Genre Classification of Web Documents

Retrieving relevant documents over the Web is a difficult task. Currently, search engines rely on keywords for matching documents to user queries. This paper explores the potential for discriminating documents based on the genre of the document. I define genre as a taxonomy that incorporates the style, form and content of a d...


Internet Genres

Rhetoricians since Aristotle have attempted to classify communications or documents into categories or “genres” with similar form, topic or purpose. This article surveys research on genre as it relates to Internet documents. The article briefly presents the concept of genre in general, and then reviews the evolution and emergence of genres on the Internet. It concludes with an examination of th...



Publication date: 2005